0.  Abstract

When a system fails it may not be obvious which components are at fault.
Locating faulty components to be replaced may require a series of inspections,
each of which reveals the state (functioning/failed) of one of the components.
The order in which components are inspected and replaced can greatly affect the
cost to restore the system to an operating condition. This paper investigates
inspection sequences for complex coherent systems.


1.  Introduction 

When a system fails it is seldom obvious which components are at fault.
Locating the faulty parts may require a sequence of tests, in each of which the
state (functioning/failed) of one of the components is identified. If the
state of every component must be determined, then the order in which components
are tested may not matter much. But often testing stops as soon as the first
failed component is found. Or testing may continue until a failed component is
located which, when replaced, fixes the system. (Such a component failure will
be called a critical failure.) In these situations the average number of tests
carried out depends upon the order in which components are tested.


Butterworth [1972] has developed two fault-testing models for k-out-of-n
systems. In both models components fail independently of one another, each
test costs a given amount, and test procedures are compared by their respective
total expected costs. In the first model, the system is failed and test
procedures must determine the state of each component. In the second model
test procedures must determine the system state. Following Butterworth's
nomenclature, a sequential test procedure prespecifies the sequence in which
components are tested; in a nonsequential procedure, the outcomes of early
tests may dictate the order of later tests. Sequential test procedures have an
important advantage over nonsequential procedures in that they are much easier
to specify and to implement. For determining the system state, Butterworth
identifies a sequential procedure that is optimal among all procedures,
sequential or not. For determining the component states, the optimal test
procedure is sequential only for some special cases, among them parallel and
series systems. For other k-out-of-n systems the optimal procedure seems to be
neither easily identifiable nor easily implementable. For more complex systems
the prospects are even worse. Our emphasis is, therefore, upon identifying
good, if not optimal, test procedures for general coherent systems.

2.  Sequential  Test  Procedures 

The information required to specify a sequential test procedure is minimal:
just an ordered list of the system's components. In contrast, the amount of
information necessary to specify a nonsequential policy may be quite large,
because the choice of the $n$th component to be tested may depend arbitrarily
upon the $2^{n-1}$ possible results of the preceding $n-1$ tests. Unless there is
a simple rule relating this choice to the previous test results, such a
procedure might require an inordinate amount of information to be specified
and, thus, be practically impossible to implement. For this reason we will
focus our attention upon sequential test procedures.

Consider first the situation where testing stops as soon as a failed
component is identified. In this case the optimal policy is sequential,
because one knows in advance that all components tested prior to the last one
will be working, and thus, the information known at any stage of testing can be
prespecified. Let $\pi = (\pi_1, \ldots, \pi_n)$ be a permutation of $1, \ldots, n$,
and let $c$ be the cost of one test. The expected cost of the sequential policy
determined by $\pi$ is given by

$$\Phi(\pi) = c\left[1 + \sum_{k=1}^{n-1} \Pr\{\text{components } \pi_1, \ldots, \pi_k \text{ are functional at time of system failure}\}\right]$$

Thus, in principle, if one can compute the above probabilities, one can
determine the optimal test policy. In practice, however, these joint
conditional failure probabilities are very hard to determine, and the number of
permutations to be evaluated is very large.

An alternative is to choose a test policy heuristically. One way to do so
is to compute the conditional failure probability of each component given
system failure, and then to test the component having the highest conditional
failure probability first, the one with the second highest conditional failure
probability second, and so on. Although this procedure is not guaranteed to
produce an optimal test policy, it is a reasonable heuristic.
Now consider the case where testing continues until a critically failed
component is identified. Here testing often continues beyond the first
detected failure, so the information known at any stage of testing cannot be
specified in advance. Thus, the optimal policy is generally not sequential,
and consequently, is that much harder to determine. The heuristic procedure
suggested above may also be applied here, substituting critical failure
probabilities in the selection criterion. As before, the policy it produces
will not necessarily be optimal.

We now turn to the question of how to effectively compute the
probabilities on which these heuristics are based.

3.  Calculation of Conditional Failure Probabilities for Heuristic Procedures

Consider a coherent system $(C, \phi)$ of $n$ components with minimal path
sets $P_1, \ldots, P_r$. Assume that components fail independently of one another,
and let $F_i(\cdot)$ be the time-to-failure distribution of component $i$. We assume
a proportional hazards model for component failures, taking

$$\bar{F}_i(t) = 1 - F_i(t) = e^{-\lambda_i R(t)}$$

where $\lambda_i$ is a proportionality constant and $R(t)$ is a hazard function
common to all components. A practical consequence of this
assumption is that instead of specifying $n$ different distribution functions
one need only specify a single hazard function together with $n - 1$
proportionality constants. In fact, the calculations in this paper will not even
depend upon the hazard function but only the proportional hazards.

Let $\mathbf{p} = \bar{\mathbf{F}}(t) = (1 - F_1(t), \ldots, 1 - F_n(t))$ denote the vector of
component reliabilities at a given time $t$. Let $h(\mathbf{p})$ be the system reliability
and let $I_h(i; \mathbf{p}) = \partial h(\mathbf{p}) / \partial p_i$ be the Birnbaum reliability importance of
component $i$ (Birnbaum [1969]).

Finally, let

$$A_h(i) = \Pr\{\text{component } i \text{ is failed at time of system failure}\}$$

and

$$C_h(i) = \Pr\{\text{component } i \text{ is critically failed at time of system failure}\}.$$


A quantity related to $C_h(i)$ is the Barlow-Proschan importance
(Barlow and Proschan [1975]),

$$P_h(i) = \Pr\{\text{component } i \text{ causes system failure}\}.$$

(A component is said to cause system failure if the failure of the system and
the component coincide.) Which component caused system failure
cannot be determined by tests that only reveal whether or not a component is
failed and nothing about how recently any failures may have occurred.
Nonetheless, the causal probabilities can be computed. Barlow and Proschan
[1975] give the following formula for $P_h(i)$:

$$P_h(i) = \int_0^{\infty} I_h(i; \bar{\mathbf{F}}(t)) \, dF_i(t).$$

In the case of proportional hazards, a simple change of variable reduces the
formula to

$$P_h(i) = \int_0^1 I_h(i; \mathbf{p}^{\lambda}) \, \lambda_i \, p^{\lambda_i - 1} \, dp \tag{1}$$

where $\mathbf{p}^{\lambda} = (p^{\lambda_1}, \ldots, p^{\lambda_n})$.


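A similar formula can be developed for $A_h(i)$. Noting that $A_h(i)$ is the
probability that the system functions at least as long as component $i$ does,

$$A_h(i) = \int_0^{\infty} h(1_i, \bar{\mathbf{F}}(t)) \, dF_i(t)$$

$$= \int_0^1 h(1_i, \mathbf{p}^{\lambda}) \, \lambda_i \, p^{\lambda_i - 1} \, dp$$

$$= \int_0^1 \left[ h(\mathbf{p}^{\lambda}) + (1 - p^{\lambda_i}) \, I_h(i; \mathbf{p}^{\lambda}) \right] \lambda_i \, p^{\lambda_i - 1} \, dp. \tag{2}$$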
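To make formulas (1) and (2) concrete, the sketch below evaluates them exactly for small systems with integer proportionality constants. It builds the reliability polynomial in $p$ by inclusion-exclusion over the min paths and integrates term by term with exact fractions. The function names and the dictionary representation of polynomials are assumptions chosen for illustration, not a description of the authors' program.

```python
from fractions import Fraction
from itertools import combinations

def reliability_poly(min_paths, lam, fixed=None):
    """h(p^lambda) as {exponent: coefficient}, by inclusion-exclusion.

    `fixed` maps a component to 0 or 1 to condition on its state.
    """
    fixed = fixed or {}
    # A path containing a component fixed at 0 can never function.
    paths = [set(P) for P in min_paths if not any(fixed.get(j) == 0 for j in P)]
    poly = {}
    for r in range(1, len(paths) + 1):
        for subset in combinations(paths, r):
            union = set().union(*subset)
            # Components fixed at 1 contribute a factor of 1 (exponent 0).
            exponent = sum(lam[j] for j in union if fixed.get(j) != 1)
            poly[exponent] = poly.get(exponent, 0) + (-1) ** (r + 1)
    return poly

def integrate_against_density(poly, lam_i):
    """Integral over [0, 1] of poly(p) * lam_i * p**(lam_i - 1) dp."""
    return sum(Fraction(c * lam_i, e + lam_i) for e, c in poly.items())

def barlow_proschan(min_paths, lam, i):
    """P_h(i) from formula (1), using I_h(i; p) = h(1_i, p) - h(0_i, p)."""
    birnbaum = dict(reliability_poly(min_paths, lam, {i: 1}))
    for e, c in reliability_poly(min_paths, lam, {i: 0}).items():
        birnbaum[e] = birnbaum.get(e, 0) - c
    return integrate_against_density(birnbaum, lam[i])

def prob_failed_at_system_failure(min_paths, lam, i):
    """A_h(i) from the second line of formula (2)."""
    return integrate_against_density(reliability_poly(min_paths, lam, {i: 1}), lam[i])

# Two-component series system with lambda = (1, 3): P_h(1) = 1/(1+3).
print(barlow_proschan([{1, 2}], {1: 1, 2: 3}, 1))
# Two-component parallel system: every component is failed at system failure.
print(prob_failed_at_system_failure([{1}, {2}], {1: 1, 2: 3}, 1))
```

Because every coefficient is an integer until the final division, this representation also illustrates why integer proportionality constants keep the bulk of the arithmetic exact.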

A comparable formula for $C_h(i)$ is more complicated. Suppose path sets
$P_1, \ldots, P_k$ contain component $i$ and $P_{k+1}, \ldots, P_r$ do not. Let $\tau$ denote
the coherent system with path sets $P_1 - \{i\}, P_2 - \{i\}, \ldots, P_k - \{i\}$ and let
$\eta$ denote the coherent system with path sets $P_{k+1}, \ldots, P_r$. Component $i$ is
critically failed if 1) it causes system failure, or 2) when the system
ultimately fails, there is a minimum path containing component $i$ having the
property that all other components are functioning. Thus,

$$C_h(i) = P_h(i) + \int_0^{\infty} \Pr\{T_i < t\} \cdot \Pr\{\eta(X) = t \text{ and } \tau(X) > t\} \, dt.$$

Let $Z_i(s, t) = \Pr\{\eta(X) < s \text{ and } \tau(X) > t\}$ and let
$z_i(t) = \partial Z_i(s, t) / \partial s \,|_{s=t}$.
In the case of proportional hazards, $z_i(t)$ can be expressed as a polynomial in
$p$, $g_i(p)$, and $C_h(i)$ reduced to

$$C_h(i) = P_h(i) + \int_0^{\infty} F_i(t) \cdot z_i(t) \, dt$$

$$= P_h(i) + \int_0^1 (1 - p^{\lambda_i}) \cdot g_i(p) \, dp. \tag{3}$$

Equations (1), (2) and (3), in principle, allow the three failure-related
probabilities $P_h(i)$, $A_h(i)$ and $C_h(i)$ to be calculated. However, these
formulae may not be easy to evaluate directly. The integrands are polynomials
in $p$, but determining the coefficients of these polynomials can be
computationally complex and very time consuming if the system is very large. Barlow
and Proschan [1975] have suggested a Monte Carlo approach to approximate the
integral in (1). A similar procedure could be used to evaluate (2) and (3).

As an alternative, we will develop an analytic methodology which exploits the
existence of modules within the coherent system to facilitate the computations.
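For comparison with the analytic approach, a Monte Carlo estimate of all three probabilities can be sketched as follows. This is only one plausible reading of such a procedure, not the method of Barlow and Proschan [1975]: it assumes exponential lifetimes (a special case of proportional hazards), treats a failed component as critically failed when restoring it alone makes the system function, and uses a hypothetical 2-out-of-3 system. All names are illustrative.

```python
import random

def system_works(min_paths, up):
    """True if some minimal path set consists entirely of working components."""
    return any(path <= up for path in min_paths)

def estimate_probabilities(min_paths, rates, trials=20000, seed=1):
    """Return Monte Carlo estimates of (A_h, C_h, P_h) as dictionaries."""
    rng = random.Random(seed)
    comps = sorted(rates)
    failed = dict.fromkeys(comps, 0)
    critical = dict.fromkeys(comps, 0)
    caused = dict.fromkeys(comps, 0)
    for _ in range(trials):
        life = {j: rng.expovariate(rates[j]) for j in comps}
        t_sys = max(min(life[j] for j in path) for path in min_paths)
        up = {j for j in comps if life[j] > t_sys}
        for j in comps:
            if j in up:
                continue
            failed[j] += 1
            # Critically failed: replacing j alone restores the system.
            if system_works(min_paths, up | {j}):
                critical[j] += 1
            # Cause of failure: j's lifetime coincides with the system's.
            if life[j] == t_sys:
                caused[j] += 1

    def scale(counts):
        return {j: counts[j] / trials for j in comps}

    return scale(failed), scale(critical), scale(caused)

min_paths = [{1, 2}, {1, 3}, {2, 3}]   # hypothetical 2-out-of-3 system
rates = {1: 1.0, 2: 2.0, 3: 3.0}       # hypothetical failure rates
A_hat, C_hat, P_hat = estimate_probabilities(min_paths, rates)
print(A_hat, C_hat, P_hat)
```

In every trial the component that causes failure is also critically failed, and every critically failed component is failed, so the estimates satisfy $P_h(i) \le C_h(i) \le A_h(i)$ by construction.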

A module of a coherent system $(C, \phi)$ is a subset $A$ of components
organized into some coherent substructure $\chi$ such that the system performance
depends only upon the components in $A$ through the performance of $\chi$, i.e.,

$$\phi(\mathbf{x}) = \psi(\chi(\mathbf{x}^{A}), \mathbf{x}^{A^c}),$$

where $\mathbf{x}^{A}$ denotes a vector with components $x_i$, $i \in A$, and $A^c = C - A$. The
structure function $\psi$ is called the organizing structure. Birnbaum [1969]
shows that the overall importance of a component within a module is the product
of its importance within the module and the importance of the module within the
organizing structure, i.e.,

$$I^{\phi}(i) = I^{\chi}(i) \cdot I^{\psi}(M). \tag{4}$$


This formula allows the computation of the integrand in (2) to proceed on a
module-by-module basis instead of having to consider the entire system at
once. This is a crucial advantage, since the computations involved in
determining $h(\mathbf{p})$, $I_h(i; \mathbf{p})$, or $g_i(p)$ directly grow exponentially with the
number of min paths.
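The product rule (4) can be checked numerically on a toy system. The sketch below assumes a hypothetical structure in which component 1 is in series with a module $M$ made of components 2 and 3 in parallel; the reliabilities and function names are illustrative assumptions.

```python
import math

def birnbaum(h, p, i):
    """I_h(i; p) = h(1_i, p) - h(0_i, p) for a multilinear reliability function h."""
    hi = dict(p)
    hi[i] = 1.0
    lo = dict(p)
    lo[i] = 0.0
    return h(hi) - h(lo)

def h_module(p):
    """chi: components 2 and 3 in parallel."""
    return 1 - (1 - p[2]) * (1 - p[3])

def h_organizing(q):
    """psi: component 1 in series with the module M."""
    return q[1] * q["M"]

def h_system(p):
    """phi: the organizing structure applied to the module's reliability."""
    return h_organizing({1: p[1], "M": h_module(p)})

p = {1: 0.9, 2: 0.8, 3: 0.7}   # hypothetical component reliabilities

direct = birnbaum(h_system, p, 2)
within_module = birnbaum(h_module, p, 2)
module_in_system = birnbaum(h_organizing, {1: p[1], "M": h_module(p)}, "M")

# Equation (4): importance in the system = importance in module * importance of module.
print(direct, within_module * module_in_system)
```

Only the small functions `h_module` and `h_organizing` are ever differenced on the modular side, which is the source of the computational saving the text describes.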

In order to calculate $C_h(i)$ efficiently, a modular decomposition formula
for $g_i(p)$ is needed.

Proposition:

$$g_i^{\phi}(p) = g_i^{\chi}(p) \cdot I^{\psi}(M; \mathbf{p}) + I^{\chi}(i; \mathbf{p}) \cdot g_M^{\psi}(p) \tag{5}$$

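Proof: Suppose $\psi$ is defined by min paths $P_1, \ldots, P_r$. Suppose $P_1, \ldots, P_\ell$
contain module $M$ and $P_{\ell+1}, \ldots, P_r$ do not. Let $\tau_M^{\psi}$ denote the structure
function defined by min paths $P_1 - \{M\}, \ldots, P_\ell - \{M\}$, and let $\eta_M^{\psi}$ be the
structure function defined by $P_{\ell+1}, \ldots, P_r$. Define $\tau_i^{\chi}$, $\eta_i^{\chi}$, $\tau_i^{\phi}$ and
$\eta_i^{\phi}$ similarly. These quantities are interrelated as follows:

$$\tau_i^{\phi} = \tau_M^{\psi} \cdot \tau_i^{\chi}$$

$$\eta_i^{\phi} = \eta_M^{\psi} + \eta_i^{\chi} \cdot \tau_M^{\psi} - \tau_M^{\psi} \cdot \eta_i^{\chi} \cdot \eta_M^{\psi}$$

Let $T_i$ denote the lifetime of component $i$, and let

$$X_i(t) = \begin{cases} 1 & \text{if } T_i > t \\ 0 & \text{otherwise.} \end{cases}$$

Then

$$Z_i^{\phi}(s, t) = \Pr\{\eta_i^{\phi}(X(s)) = 0 \text{ and } \tau_i^{\phi}(X(t)) = 1\}$$

$$= \Pr[\{\eta_M^{\psi}(X(s)) = 0\} \text{ and } \{\eta_i^{\chi}(X(s)) = 0 \text{ or } \tau_M^{\psi}(X(s)) = 0\} \text{ and } \{\tau_M^{\psi}(X(t)) = 1\} \text{ and } \{\tau_i^{\chi}(X(t)) = 1\}].$$

For $s < t$, the event $\{\tau_M^{\psi}(X(t)) = 1\}$ excludes $\{\tau_M^{\psi}(X(s)) = 0\}$, and components
inside the module fail independently of those outside it, so

$$Z_i^{\phi}(s, t) = \Pr\{\eta_i^{\chi}(X(s)) = 0 \text{ and } \tau_i^{\chi}(X(t)) = 1\} \cdot \Pr\{\eta_M^{\psi}(X(s)) = 0 \text{ and } \tau_M^{\psi}(X(t)) = 1\}$$

$$= Z_i^{\chi}(s, t) \cdot Z_M^{\psi}(s, t).$$

Differentiating with respect to $s$ and setting $s = t$,

$$z_i^{\phi}(t) = z_i^{\chi}(t) \cdot Z_M^{\psi}(t, t) + Z_i^{\chi}(t, t) \cdot z_M^{\psi}(t).$$

Since for any coherent structure,

$$Z_i(t, t) = \Pr\{\eta(X(t)) = 0 \text{ and } \tau(X(t)) = 1\}$$

$$= h(1_i, \mathbf{p}) - h(0_i, \mathbf{p})$$

$$= I_h(i; \mathbf{p}),$$

the result is established.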


4.  Computational  Notes  and  an  Example 


A computer program to determine conditional-failure and critical-failure
probabilities has been developed by and is available from the authors. This
program is written in ANSI-Standard FORTRAN and exploits modules to efficiently
compute $A_h(i)$ and $C_h(i)$. The program only allows integer values for the
$\lambda_i$. It is easy to show that if one multiplies each $\lambda_i$ by a common positive
constant no change to $A_h(i)$ or $C_h(i)$ results. Thus, requiring integer
values for the $\lambda_i$ is not a very severe restriction. The system structure is
entered in a "bottom-up" fashion, with the most elementary modules input first.
The inclusion-exclusion method is used to compute the reliability $h(\mathbf{p})$ of
each module immediately after it is input. As each $h(\mathbf{p})$ is computed, $I_h(i; \mathbf{p})$
and $g_i(p)$ are determined also. As modules are input, equations (4) and (5)
are used to determine the overall importance and critical polynomials. Once
all modules have been input, equations (2) and (3) are used to compute $A_h(i)$
and $C_h(i)$. Thus, the heuristic sequential inspection policies for the two
optimization criteria may be obtained. Special methods of encoding polynomials
and of sequencing through path subsets are employed to improve speed.


To illustrate the efficiency of exploiting modules in determining these
failure probabilities, consider the following example system.

The circled numbers are the failure rates. The table below gives the
conditional-failure and critical-failure probabilities.


     i      A_h(i)     C_h(i)

     1      0.605      0.350
     2      0.385      0.292
     3      0.568      0.213
     4      0.690      0.213
     5      0.729      0.380
     6      0.635      0.439
     7      0.336      0.046
     8      0.368      0.238
     9      0.777      0.238
    10      0.350      0.122

Optimization Criterion          Heuristic Inspection Sequence


To compute the above directly (without breaking down the system into modules)
required 10.7 CPU-seconds on a CYBER 73. The most extensive modular
decomposition of the system would be to consider components 3 and 4 as one module,
component 1 and the first module as a second module, components 8 and 9 as a
third module, and components 2, 5, 6, 7, 10 and modules 2 and 3 as a fourth
module. Utilizing this modular decomposition reduced the computation time by
96%, to 0.4 CPU-seconds.
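The heuristic inspection sequences implied by the tabulated probabilities can be obtained by a simple sort. The sketch below assumes that the first column of probabilities is the conditional-failure probability $A_h(i)$ and the second the critical-failure probability $C_h(i)$, and it breaks ties by component number, which is an arbitrary choice not taken from the paper.

```python
# Probabilities copied from the example table (column assignment is an assumption).
A = {1: 0.605, 2: 0.385, 3: 0.568, 4: 0.690, 5: 0.729,
     6: 0.635, 7: 0.336, 8: 0.368, 9: 0.777, 10: 0.350}
C = {1: 0.350, 2: 0.292, 3: 0.213, 4: 0.213, 5: 0.380,
     6: 0.439, 7: 0.046, 8: 0.238, 9: 0.238, 10: 0.122}

def heuristic_sequence(prob):
    """Order components by decreasing probability; ties go to the lower index."""
    return sorted(prob, key=lambda i: (-prob[i], i))

# Criterion: stop at the first failed component.
print(heuristic_sequence(A))   # [9, 5, 4, 6, 1, 3, 2, 8, 10, 7]
# Criterion: stop at the first critically failed component.
print(heuristic_sequence(C))   # [6, 5, 1, 2, 8, 9, 3, 4, 10, 7]
```

Components 8 and 9, and likewise 3 and 4, share the same critical-failure probability, so their relative order under the second criterion depends entirely on the tie-breaking rule.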

Although this computer program was developed on a large computer, it
should be easily adaptable to microcomputers. Almost all of the calculations
involve integer arithmetic (assuming integer values for the $\lambda_i$). Only the
final polynomial integrations require floating point division. Overall memory
requirements can be large, because each component has associated with it two
polynomials, $I_h(i; \mathbf{p})$ and $g_i(p)$, each of which may have hundreds of terms.
In addition, each module has associated with it a reliability polynomial,
$h(\mathbf{p})$. However, only small portions of this information are needed at any
given time, and so it can be stored on relatively slow-speed media, such as
floppy disks.

5.  Conclusions

Locating faulty components in a failed system may require costly
inspections. The total expected cost to find a failed or critically failed
component can be significantly influenced by the order in which components are
tested. Determining the optimal inspection sequence is impractical for all but
the simplest system structures, so heuristic procedures must be used.

Using the proposed heuristics and decomposing the system structure into
suitably sized modules allows simple and intuitive test procedures to be
developed with a minimum amount of computational effort.